The (Un)expected Effects of Applying Standard Cleansing Models to Human Ratings on Compositionality

Authors

  • Stephen Roller
  • Sabine Schulte im Walde
  • Silke Scheible
Abstract

Human ratings are an important source for evaluating computational models that predict compositionality but, like many data sets of human semantic judgements, they are often fraught with uncertainty and noise. Despite their importance, to our knowledge there has been no extensive look at the effects of cleansing methods on human rating data. This paper assesses two standard cleansing approaches on two sets of compositionality ratings for German noun-noun compounds, in their ability to produce compositionality ratings of higher consistency while reducing data quantity. We find (i) that our ratings are highly robust against aggressive filtering; (ii) that Z-score filtering fails to detect unreliable item ratings; and (iii) that Minimum Subject Agreement is highly effective at detecting unreliable subjects.


Similar resources

Assessment of the Effects of Economic Sanctions on Iranians’ Right to Health by Using Human Rights Impact Assessment Tool: A Systematic Review

Background: Over the years, economic sanctions have contributed to violations of the right to health in target countries. Iran has been under comprehensive unilateral economic sanctions by groups of countries (not the United Nations [UN]) in recent years. They have been intensified since 2012 because of the international community’s uncertainty about the peaceful purpose of Iran’s nuclear program and inadequacy o...

GhoSt-PV: A Representative Gold Standard of German Particle Verbs

German particle verbs represent a frequent type of multi-word-expression that forms a highly productive paradigm in the lexicon. Similarly to other multi-word expressions, particle verbs exhibit various levels of compositionality. One of the major obstacles for the study of compositionality is the lack of representative gold standards of human ratings. In order to address this bottleneck, this ...

The Effects of Physical, Human and Social Capitals on the Entrepreneurship Level of Economic actors in Shahid Salimi industrial town of Tabriz: Structural Equations and Order Logit Models

The main purpose of this study is to investigate the effects of physical capital, human capital and social capital on the entrepreneurship level of individuals, using a structural equations model and an order logit model in Shahid Salimi industrial town of Tabriz in 2016. The data were collected from 121 economic activists who were randomly selected from the population. The empirical results show tha...

Detecting Compositionality of Multi-Word Expressions using Nearest Neighbours in Vector Space Models

We present a novel unsupervised approach to detecting the compositionality of multi-word expressions. We compute the compositionality of a phrase through substituting the constituent words with their “neighbours” in a semantic vector space and averaging over the distance between the original phrase and the substituted neighbour phrases. Several methods of obtaining neighbours are presented. The...

Non-Verbal Communication in Models of Communicative Competence and L2 Teachers’ Rating

Non-verbal communication (NVC) plays a major role in various aspects of human life (Andersen, 2004; Cameron, 2001; Johnstone, 2008). Children learning their first language come to realize non-verbal communication as their socialization process takes place (Fletcher & German, 1990; Ingram, 1996; Owens, 2001). However, most EFL learners may have little exposure to these non-verbal aspects of comm...

Publication date: 2013